17 research outputs found

    Towards human interaction analysis

    Modeling and recognizing human behaviors in a visual surveillance task is receiving increasing attention from computer vision and machine learning researchers. Such a system should deal in particular with detecting when interactions between people occur and classifying the type of interaction. In this work we study a flexible model for detecting human interactions. This is done by detecting the people in the scene and retrieving their corresponding pose and position sequentially in each frame of the video. To achieve this goal, our work relies on a robust object detection algorithm, based on discriminatively trained part-based models, to detect the human bodies in videos. We apply a Gaussian mixture model based method for background subtraction and human segmentation. The output of the segmentation method, which is labeled as human body, is combined with the background subtraction output to obtain a bounding box around each person in the images, which improves human body pose detection. To obtain more precise pose detection models, we trained the algorithm on a large, challenging, but reliable dataset (PASCAL 2010). Our method is applied to a home-made database comprising depth data from Kinect sensors. Once the label, pose, and position of each person have been obtained for every image sequence, understanding of human motion comes naturally, which is an important step towards human interaction analysis.
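
As an illustration of the bounding-box step described in the abstract above, the following is a minimal sketch, not the authors' implementation. It assumes both the segmentation output and the background-subtraction output are available as binary masks of equal size, and it assumes they are combined with a logical AND; the abstract does not state how the two are combined. The function names `combine_masks` and `bounding_box` are hypothetical.

```python
def combine_masks(segmentation, foreground):
    """Keep a pixel only if both masks mark it (logical AND, an assumption)."""
    return [
        [int(bool(s) and bool(f)) for s, f in zip(seg_row, fg_row)]
        for seg_row, fg_row in zip(segmentation, foreground)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the nonzero pixels, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for row in mask for c, value in enumerate(row) if value]
    return rows[0], min(cols), rows[-1], max(cols)


# Toy 3x4 masks: 1 marks a pixel labeled as human body / as foreground.
segmentation = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
foreground = [
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

person_mask = combine_masks(segmentation, foreground)
box = bounding_box(person_mask)
print(box)
```

This sketch handles a single person per mask; separating several people would additionally require connected-component labeling, which is omitted here.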

    Towards social pattern characterization in egocentric photo-streams

    Following the increasingly popular trend of social interaction analysis in egocentric vision, this article presents a comprehensive pipeline for automatic social pattern characterization of a wearable photo-camera user. The proposed framework relies solely on the visual analysis of egocentric photo-streams and consists of three major steps. The first step is to detect the social interactions of the user, where the impact of several social signals on the task is explored. In the second step, the detected social events are categorized into different types of social meetings. These two steps act at the event level: each potential social event is modeled as a multi-dimensional time series whose dimensions correspond to a set of features relevant to each task, and an LSTM is employed to classify the time series. The last step of the framework is to characterize the social patterns of the user. Our goal is to quantify the duration, the diversity, and the frequency of the user's social relations in various social situations. This goal is achieved by discovering recurrences of the same people across the whole set of social events related to the user. Experimental evaluation on EgoSocialStyle, the dataset proposed in this work, and on EGO-GROUP demonstrates promising results on the task of social pattern characterization from egocentric photo-streams.
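
The final characterization step described above can be sketched as follows. This is a hedged illustration, not the article's method: it assumes each detected social event already carries a list of recognized person identifiers, a duration in minutes, and a meeting category, and it adopts hypothetical working definitions (frequency as the number of events a person appears in, duration as the total time across those events, diversity as the number of distinct meeting categories shared with that person). The field names and the function `characterize` are assumptions.

```python
from collections import defaultdict


def characterize(events):
    """Aggregate per-person statistics over a list of social events."""
    stats = defaultdict(
        lambda: {"frequency": 0, "duration": 0.0, "categories": set()}
    )
    for event in events:
        # set() guards against the same person being listed twice in an event.
        for person in set(event["people"]):
            entry = stats[person]
            entry["frequency"] += 1
            entry["duration"] += event["minutes"]
            entry["categories"].add(event["category"])
    return {
        person: {
            "frequency": entry["frequency"],
            "duration": entry["duration"],
            "diversity": len(entry["categories"]),
        }
        for person, entry in stats.items()
    }


# Toy input: three events with hypothetical person identifiers and categories.
events = [
    {"people": ["A", "B"], "minutes": 30, "category": "formal"},
    {"people": ["A"], "minutes": 15, "category": "informal"},
    {"people": ["A", "B"], "minutes": 10, "category": "formal"},
]

patterns = characterize(events)
print(patterns["A"])
print(patterns["B"])
```

In practice the person identifiers would come from re-identifying the same faces across events, which is the recurrence discovery the abstract refers to and is outside the scope of this sketch.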

    Social Signal Processing from Egocentric Photo-Streams

    Wearable photo-cameras offer a hands-free way to record images of daily experiences from the camera wearer's perspective as they are lived, without the need to interrupt recording because of battery or storage limitations. This stream of images, known as an egocentric photo-stream, contains important visual data about the life of the user, among which social events are of special interest. Social interactions have been shown to be a key to longevity, and having too few interactions carries the same risk factor as smoking regularly. Considering the importance of the matter, it is no wonder that automatic analysis of social interactions is attracting considerable interest from the scientific community. Analysis of unconstrained photo-streams, however, imposes novel challenges on the social signal processing problem with respect to conventional videos. Due to the free motion of the camera and its low temporal resolution, abrupt changes in the field of view, in illumination conditions, and in the target location are highly frequent. Also, since images are acquired under real-world conditions, occlusions occur regularly and the appearance of people varies strongly from one event to another. Given a user wearing a photo-camera during a determined period, this thesis, driven by the social signal processing paradigm, presents a framework for comprehensive social pattern characterization of the user. In social signal processing, the second step after recording the scene is to track the appearance of the multiple people involved in the social events. Hence, our proposal begins by introducing a multi-face tracking method with characteristics designed to deal with the challenges imposed by egocentric photo-streams. The next step in social signal processing is to extract the so-called social signals from the tracked people. In this step, besides the conventionally studied social signals, clothing is proposed as a novel social signal for further study within social signal processing. Finally, the last step is social signal analysis itself. In this thesis, social signal analysis is essentially defined as reaching an understanding of the social patterns of a wearable photo-camera user by reviewing the photos captured by the worn camera over a period of time. Our proposal for social signal analysis consists of several steps. The first is to detect the social interactions of the user, where the impact of several social signals on the task is explored. In the second step, the detected social events are categorized into different types of social meetings. The last step of the framework is to characterize the social patterns of the user. Our goal is to quantify the duration, the diversity, and the frequency of the user's social relations in various social situations. This goal is achieved by discovering recurrences of the same people across the whole set of social events related to the user. Each step of our proposed pipeline is validated on relevant datasets, and the obtained results are reported quantitatively and qualitatively. For each section of the pipeline, a comparison with related state-of-the-art models is provided. A discussion of the obtained results is also given, highlighting the advantages, shortcomings, and differences of the proposed models, both among themselves and with respect to the state of the art.
